Approximate solutions to constrained risk-sensitive Markov decision processes

Authors

Abstract

This paper considers the problem of finding near-optimal Markovian randomized (MR) policies for finite-state-action, infinite-horizon, constrained risk-sensitive Markov decision processes (CRSMDPs). The constraints are in the form of standard expected discounted cost functions over finite and infinite horizons. We first show that the CRSMDP optimization problem possesses a solution if it is feasible, that is, if there exists a policy which satisfies all of the constraints. Secondly, we provide two methods for finding an approximate solution in the form of an ultimately stationary (US) MR policy. The latter is achieved through two approximating finite-horizon CRSMDPs constructed from the original problem by time-truncating the objective and constraint functions and suitably perturbing the constraint upper bounds. The first approximation gives a US ϵ-optimal policy for the original problem, while the second gives a US policy whose violation of the original constraints is bounded above by a specified tolerance value ϵ. A key step in the proofs is an appropriate choice of a metric which makes the set of infinite-horizon MR policies and the feasible regions of the three problems compact, and the objective and constraint functions continuous. We also discuss applications and use an inventory control example to illustrate how existing techniques may be used to solve the approximate problems mentioned above.
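The approximation approach described in the abstract reduces the infinite-horizon problem to finite-horizon constrained problems that existing techniques can solve. As a hedged illustration only (not the paper's method, and using standard risk-neutral expected discounted costs rather than risk-sensitive ones), the sketch below solves a toy time-truncated constrained MDP with the classical finite-horizon occupancy-measure linear program. All model data, the truncation horizon T, and the constraint bound b are invented for illustration.

```python
# Sketch: a time-truncated constrained MDP solved via the standard
# finite-horizon occupancy-measure LP. Toy data, not from the paper.
import numpy as np
from scipy.optimize import linprog

S, A, T = 2, 2, 3          # states, actions, truncation horizon
gamma = 0.9                # discount factor
# P[s, a, s']: action 0 stays in place, action 1 switches state
P = np.zeros((S, A, S))
for s in range(S):
    P[s, 0, s] = 1.0
    P[s, 1, 1 - s] = 1.0
c = np.array([[0.0, 1.0], [0.0, 1.0]])   # objective cost c(s, a)
d = np.array([[1.0, 0.0], [1.0, 0.0]])   # constraint cost d(s, a)
b = 1.5                                  # constraint upper bound
mu0 = np.array([0.5, 0.5])               # initial state distribution

n = T * S * A                            # occupancy variables x_t(s, a)
idx = lambda t, s, a: (t * S + s) * A + a

# Flow-conservation equalities: one row per (t, s)
A_eq = np.zeros((T * S, n))
b_eq = np.zeros(T * S)
for s in range(S):
    for a in range(A):
        A_eq[s, idx(0, s, a)] = 1.0      # initial occupancy matches mu0
    b_eq[s] = mu0[s]
for t in range(1, T):
    for s2 in range(S):
        row = t * S + s2
        for a in range(A):
            A_eq[row, idx(t, s2, a)] = 1.0
        for s in range(S):
            for a in range(A):
                A_eq[row, idx(t - 1, s, a)] -= P[s, a, s2]

# Discounted objective and a single discounted-cost constraint
obj = np.array([gamma**t * c[s, a] for t in range(T) for s in range(S) for a in range(A)])
con = np.array([gamma**t * d[s, a] for t in range(T) for s in range(S) for a in range(A)])

res = linprog(obj, A_ub=con[None, :], b_ub=[b], A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * n, method="highs")
x = res.x.reshape(T, S, A)
# A Markovian randomized policy is recovered by normalizing occupancies
pi = x / np.maximum(x.sum(axis=2, keepdims=True), 1e-12)
print("truncated optimal cost:", res.fun)
print("constraint value:", con @ res.x, "<= bound", b)
```

With this toy data the cheap action carries all of the constraint cost, so the constraint binds: the discounted constraint value equals b = 1.5 and the optimal truncated cost is (1 + 0.9 + 0.81) − 1.5 = 1.21. The paper's US policies would additionally patch such a finite-horizon solution with a stationary tail, which this sketch does not attempt.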


Related articles

More Risk-Sensitive Markov Decision Processes

We investigate the problem of minimizing a certainty equivalent of the total or discounted cost over a finite and an infinite horizon which is generated by a Markov Decision Process (MDP). The certainty equivalent is defined by U⁻¹(E U(Y)), where U is an increasing function. In contrast to a risk-neutral decision maker, this optimization criterion takes the variability of the cost into account. I...


Risk-Sensitive Control of Markov Decision Processes

This paper introduces an algorithm to determine near-optimal control laws for Markov Decision Processes with a risk-sensitive criterion. Both the fully observed and the partially observed settings are considered, for finite and infinite horizon formulations. Dynamic programming equations are introduced which characterize the value function for the partially observed, infinite horizon, discounted c...


Approximate Linear Programming for Constrained Partially Observable Markov Decision Processes

In many situations, it is desirable to optimize a sequence of decisions by maximizing a primary objective while respecting some constraints with respect to secondary objectives. Such problems can be naturally modeled as constrained partially observable Markov decision processes (CPOMDPs) when the environment is partially observable. In this work, we describe a technique based on approximate lin...


Constrained Markov Decision Processes

In many situations in the optimization of dynamic systems, a single utility for the optimizer might not suffice to describe the real objectives involved in the sequential decision making. A natural approach for handling such cases is that of optimization of one objective with constraints on other ones. This allows in particular to understand the tradeoff between t...


Approximate Probabilistic Constraints and Risk-Sensitive Optimization Criteria in Markov Decision Processes

The majority of the work in the area of Markov decision processes has focused on expected values of rewards in the objective function and expected costs in the constraints. Although several methods have been proposed to model risk-sensitive utility functions and constraints, they are only applicable to certain classes of utility functions and allow limited expressiveness in the constraints. We p...



Journal

Journal title: European Journal of Operational Research

Year: 2023

ISSN: 1872-6860, 0377-2217

DOI: https://doi.org/10.1016/j.ejor.2023.02.039